Vertex Model Deployment
This guide explains how to set up a Vertex AI model in Google Cloud Platform (GCP) and use it in the AI DIAL configuration.
Prerequisites
- Active Google Cloud project
- Enabled billing for the project
Refer to Google Cloud Documentation to learn how to create an account and enable billing.
Step 1: Configuring the AI Model
Request Access to Models
- Log in to your Google Cloud account.
- In the navigation panel on the left, under APIs & Services, select Enabled APIs & Services.
- On the APIs & Services page, click + Enable APIs and Services to open the API library.
- In the search bar, type Vertex AI API and select the Vertex AI API panel when it appears in the search results.
- Click Enable to turn on the Vertex AI API for your Google Cloud project.
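The console steps above also have a command-line equivalent. The following is a minimal sketch that only builds and prints the `gcloud services enable` command rather than running it. The helper name `enable_vertex_ai_command` is hypothetical, and `your-project-id` is a placeholder. Running the printed command assumes the gcloud CLI is installed and authenticated.

```python
import shlex

# Service name of the Vertex AI API in Google Cloud.
VERTEX_AI_SERVICE = "aiplatform.googleapis.com"


def enable_vertex_ai_command(project_id: str) -> list[str]:
    """Build the gcloud command that enables the Vertex AI API for a project."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    return ["gcloud", "services", "enable", VERTEX_AI_SERVICE, "--project", project_id]


# "your-project-id" is a placeholder; replace it with your own project ID.
print(shlex.join(enable_vertex_ai_command("your-project-id")))
# prints: gcloud services enable aiplatform.googleapis.com --project your-project-id
```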
Step 2: Get Access to AI Model
Create a Service Account
Configure GCP Service Account and Get JSON Key
A service account is required to communicate with Vertex AI models.
To create a service account:
- In your Google Cloud account, in the main navigation menu find IAM & Admin and navigate to Service Accounts.
- To create a new service account, click + Create Service Account and fill in the details for your new service account:
  - Fill in the Service account details.
  - In the next step Grant this service account access to project, add Vertex AI Custom Code Service Agent role. Refer to GCP Documentation to learn more.
  - Click Done to complete.
- The new service account appears on the Service Account page. Click it to view the details:
  - In KEYS, create a key for this service account and download it in JSON format.
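Before mounting the downloaded key, it can help to confirm that the file is a service account key. The sketch below is one possible check, not an official validation. The helper names and the list of required fields are assumptions based on the fields usually found in such key files. The sample values are placeholders, not real credentials.

```python
import json

# Fields assumed to be present in a service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def find_key_problems(key: dict) -> list[str]:
    """Return a list of human-readable problems found in a parsed key."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems


def check_key_file(path: str) -> list[str]:
    """Load a JSON key file from disk and check it."""
    with open(path, encoding="utf-8") as handle:
        return find_key_problems(json.load(handle))


# In-memory example with placeholder values.
sample_key = {
    "type": "service_account",
    "project_id": "your-project-id",
    "private_key": "placeholder",
    "client_email": "your-sa-id@your-project-id.iam.gserviceaccount.com",
}
print(find_key_problems(sample_key))  # prints: []
```

An empty list means that none of the checked fields is missing. It does not prove that the key is active or has the right role.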
Configure Kubernetes Service Account
If your cluster runs in GCP, the best practice for using Vertex AI is to bind a GCP IAM service account to a Kubernetes service account. You can do this via Workload Identity Federation for GKE.
Refer to GCP Documentation to learn how to configure a Workload Identity Federation for GKE.
Step 3: Add Model to AI DIAL
Add Model to AI DIAL Core Config
To deploy a model to AI DIAL, add it to the AI DIAL Core config and configure an adapter for it.
Add your model with its parameters in the models section.
Refer to AI DIAL Core Configuration to view an example.
Refer to Configure core config to view the configuration of AI DIAL core parameters in the helm-based installation.
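As an illustration of what an entry in the models section might look like, here is a minimal sketch that generates one as JSON. Everything specific in it is an assumption: the field names `type` and `endpoint`, the endpoint path, the deployment name `gemini-pro`, and the adapter host `http://dial-vertexai` are all hypothetical placeholders. Check the AI DIAL Core Configuration reference for the actual schema.

```python
import json


def vertex_model_entry(deployment: str, adapter_url: str) -> dict:
    """Build a models entry that routes a deployment to the Vertex adapter.

    The field names and the URL path are assumptions, not a confirmed schema.
    """
    base = adapter_url.rstrip("/")
    endpoint = f"{base}/openai/deployments/{deployment}/chat/completions"
    return {deployment: {"type": "chat", "endpoint": endpoint}}


# "gemini-pro" and "http://dial-vertexai" are hypothetical placeholders.
config_fragment = {"models": vertex_model_entry("gemini-pro", "http://dial-vertexai")}
print(json.dumps(config_fragment, indent=2))
```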
Configure AI DIAL Adapter
AI DIAL works with models through applications called adapters. You can configure the Vertex AI adapter via environment variables.
Refer to Adapter for Vertex to view documentation for a Vertex AI DIAL Adapter.
Use GCP Service Account with JSON Key
The JSON file with your GCP key must be mounted into the adapter pod as a file. Use whichever mounting method best suits your environment.
Example of mounting JSON key using secrets:
vertexai:
  enabled: true
  env:
    DEFAULT_REGION: "your-region"
    GOOGLE_APPLICATION_CREDENTIALS: "/mnt/secrets-store/gcp-ai-key"
    GCP_PROJECT_ID: "your-project-id"
  secrets:
    gcp-ai-key: |
      {
        "type": "service_account",
        ...
        "universe_domain": "googleapis.com"
      }
  extraVolumes:
    - name: key-file
      secret:
        secretName: '{{ template "dialExtension.names.fullname" . }}'
        items:
          - key: gcp-ai-key
            path: gcp-ai-key
  extraVolumeMounts:
    - name: key-file
      mountPath: "/mnt/secrets-store"
      readOnly: true
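To confirm that the mount and the environment variables from the example above line up, a small check can be run inside the adapter pod. This is a minimal sketch. The helper `check_adapter_env` is hypothetical and not part of the adapter. Treating all three variables as required is an assumption taken from this example, not from the adapter documentation.

```python
import json
import os
from collections.abc import Mapping

# Variables set in the example above; treating all of them as required is an assumption.
REQUIRED_VARS = ("DEFAULT_REGION", "GCP_PROJECT_ID", "GOOGLE_APPLICATION_CREDENTIALS")


def check_adapter_env(env: Mapping[str, str]) -> list[str]:
    """Return a list of problems with the adapter environment."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        try:
            # The key must be readable and must parse as JSON.
            with open(key_path, encoding="utf-8") as handle:
                json.load(handle)
        except OSError as error:
            problems.append(f"cannot read key file: {error}")
        except ValueError:
            problems.append("key file is not valid JSON")
    return problems


# Check the environment of the current process.
print(check_adapter_env(os.environ))
```

An empty list means that the variables are set and the key file is readable JSON. It does not verify that the key is accepted by Google Cloud.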
Use GCP Service Account with Workload Identity Federation for GKE
Before taking this step, complete the setup described in Authenticate to Google Cloud APIs from GKE workloads.
In this scenario, the Kubernetes service account is linked to a GCP IAM service account (your-sa-id).
vertexai:
  enabled: true
  serviceAccount:
    create: true
    annotations:
      iam.gke.io/gcp-service-account: your-sa-id@your-project-id.iam.gserviceaccount.com
  env:
    DIAL_URL: "http://dial-core"
    GCP_PROJECT_ID: "your-project-id"
    DEFAULT_REGION: "your-region"
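Two identifiers are easy to mistype in this setup. One is the IAM service account email used in the annotation. The other is the member string that Workload Identity Federation for GKE uses when granting the Kubernetes service account the `roles/iam.workloadIdentityUser` role on the IAM service account. The sketch below derives both from their parts. The helper names are hypothetical, and the namespace `dial` and Kubernetes service account name `dial-vertexai` are placeholders.

```python
def gsa_email(sa_id: str, project_id: str) -> str:
    """Email of the GCP IAM service account, as used in the annotation."""
    return f"{sa_id}@{project_id}.iam.gserviceaccount.com"


def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """IAM member string that identifies a Kubernetes service account."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


# Placeholder values; replace them with your own identifiers.
print(gsa_email("your-sa-id", "your-project-id"))
# prints: your-sa-id@your-project-id.iam.gserviceaccount.com
print(workload_identity_member("your-project-id", "dial", "dial-vertexai"))
# prints: serviceAccount:your-project-id.svc.id.goog[dial/dial-vertexai]
```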